fix(sdk): propagate connector-discovered status to the model row #23
Ledger.add() never carried a discovered node's status into the models table: register() was called without status (defaulting to 'active') and early-returns cached refs unchanged, while add() only bumped last_seen before update_model(). A connector deriving a lifecycle status from the source (e.g. 'deprecated' for an entity deleted upstream) saw it land in snapshot payloads only; the model row stayed 'active' forever.

add() now reads node.metadata['status'] (the same contract as owner/model_type/tier/purpose/model_origin), passes it to register() for new models, and assigns ref.status on existing refs when the discovered status differs. The assignment happens before the content-hash dedup check, so the row self-corrects even when the snapshot write is skipped as unchanged. Both Snowflake flush MERGE paths already SET STATUS on match, so existing rows converge on the next sync with no migration.

Statuses are validated against ModelStatus (case-insensitive) and normalized; absent or unrecognized values leave the stored status untouched, so an explicit status is never regressed to the default.

Tests: SDK-level add() coverage (new-model-with-status, existing-model flip, absent/unknown no-op, dedup-skip self-correction), Snowflake MERGE-path tests with in-memory parity, and a Hypothesis property (status equals the last valid discovered status).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b05f1e9bdf
    added = 0
    skipped = 0
    for node in nodes:
        node_status = _normalize_status(node.metadata.get("status"))
Accept status metadata case-insensitively
When status comes from sql_connector without an explicit metadata_columns mapping and the DB driver returns uppercase column names (for example Snowflake-style rows with STATUS), the connector preserves the original key in node.metadata (src/model_ledger/connectors/sql.py lines 237-240). This lookup only checks lowercase status, so the new propagation path is skipped and the model row remains active even though the discovered snapshot carries the source status. Consider normalizing metadata keys or looking up status case-insensitively here.
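One way to act on this suggestion is a case-insensitive key lookup at the call site. The helper below is a hypothetical sketch (`get_metadata_ci` is not part of model_ledger); it prefers an exact key match and otherwise falls back to the first key that matches when case is ignored.

```python
from typing import Any, Mapping


def get_metadata_ci(metadata: Mapping[str, Any], key: str, default: Any = None) -> Any:
    """Look up `key` in `metadata`, ignoring case. Hypothetical helper."""
    # An exact match wins, so existing lowercase keys behave as before.
    if key in metadata:
        return metadata[key]
    wanted = key.casefold()
    for candidate, value in metadata.items():
        if isinstance(candidate, str) and candidate.casefold() == wanted:
            return value
    return default
```

With such a helper, the lookup in the diff would become `_normalize_status(get_metadata_ci(node.metadata, "status"))`. Normalizing keys inside the connector is the alternative the review mentions, and it would cover the other metadata fields as well.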
Problem
Connectors that derive a lifecycle status from their source could not propagate it to the model registry.
Ledger.add() called register() without a status (defaulting to active), and register() early-returns cached/existing refs unchanged; add() then only bumped last_seen before update_model(). So a connector that derives status='deprecated' for an entity deleted at the source saw it land in snapshot payloads only; the model row's status stayed active forever.
Fix
Ledger.add() now reads node.metadata['status'], the same contract add() already uses for owner/model_type/tier/purpose/model_origin, and what the SQL connector's unmapped-columns path already populates (no DataNode/ModelRef API change):
- For new models, the status is passed to register().
- For existing models, ref.status is reassigned when the discovered status differs, before the content-hash dedup check, so the row self-corrects even when the snapshot write is skipped as unchanged.
- Statuses are validated against ModelStatus (case-insensitive) and normalized; an absent or unrecognized status leaves the stored status untouched, so an explicit status is never regressed to the default.
Both Snowflake flush MERGE paths already SET STATUS on match, so existing rows self-correct on the next sync; no migration is needed.
Tests
- SDK-level add() coverage: new-model-with-status, existing-model flip, default, absent/unknown no-op, case normalization, flip-records-snapshot, dedup-skip self-correction (tests/test_graph/test_ledger_graph.py)
- Snowflake MERGE-path tests through Ledger.add() end-to-end, plus in-memory ↔ Snowflake-SQL parity on the same discovery sequence (tests/test_backends/test_snowflake_ledger.py; the mock session now captures PURPOSE/STATUS from the MERGE source instead of hardcoding them)
- A Hypothesis property: status equals the last valid discovered status (tests/test_invariants.py)
750 passed locally; ruff, mypy, and the coverage gate are clean.
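The invariant behind the property test can be stated without the SDK: after any sequence of discoveries, the stored status equals the last valid discovered status, or the default if none was valid. The sketch below checks this with the standard library's `random` module rather than Hypothesis. `VALID`, `normalize`, `apply_discoveries`, `expected_status`, and `check_invariant` are hypothetical names, and the actual test in tests/test_invariants.py may be structured differently.

```python
import random
from typing import Optional, Sequence

VALID = {"active", "deprecated"}  # assumed subset of ModelStatus values
DEFAULT = "active"


def normalize(raw: Optional[str]) -> Optional[str]:
    """Lowercase and validate; None means 'leave the stored status alone'."""
    if raw is None:
        return None
    value = raw.strip().lower()
    return value if value in VALID else None


def apply_discoveries(statuses: Sequence[Optional[str]]) -> str:
    """Model of repeated add() calls against one model row."""
    stored = DEFAULT
    for raw in statuses:
        status = normalize(raw)
        if status is not None:
            stored = status
    return stored


def expected_status(statuses: Sequence[Optional[str]]) -> str:
    """Oracle: the last valid discovered status, else the default."""
    valid = [s for s in map(normalize, statuses) if s is not None]
    return valid[-1] if valid else DEFAULT


def check_invariant(trials: int = 500, seed: int = 0) -> None:
    """Compare the model against the oracle on random discovery sequences."""
    rng = random.Random(seed)
    pool = ["active", "DEPRECATED", "Deprecated", "unknown", "", None]
    for _ in range(trials):
        seq = [rng.choice(pool) for _ in range(rng.randint(0, 8))]
        assert apply_discoveries(seq) == expected_status(seq), seq
```

Calling `check_invariant()` raises an `AssertionError` carrying the offending sequence if the two formulations ever disagree. A property-testing library adds shrinking of failing inputs on top of this idea.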
🤖 Generated with Claude Code